Dataset statistics
| Number of variables | 17 |
|---|---|
| Number of observations | 1000 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 132.9 KiB |
| Average record size in memory | 136.1 B |
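As a rough consistency check, the memory figures in this report can be reproduced with simple arithmetic. The sketch below assumes pandas' shallow memory accounting, with 8 bytes per cell (a 64-bit number or an object pointer) and a fixed 128-byte `RangeIndex`. Both figures are assumptions, not values stated in the report.

```python
# Assumed accounting (not stated in the report): 8 bytes per cell,
# plus a fixed 128-byte RangeIndex.
N_ROWS = 1000          # "Number of observations"
N_COLS = 17            # "Number of variables"
BYTES_PER_CELL = 8     # assumption
INDEX_BYTES = 128      # assumption

column_bytes = N_ROWS * BYTES_PER_CELL
total_bytes = N_COLS * column_bytes + INDEX_BYTES

column_kib = column_bytes / 1024          # per-variable "Memory size"
total_kib = total_bytes / 1024            # "Total size in memory"
avg_record_bytes = total_bytes / N_ROWS   # "Average record size in memory"

print(f"{column_kib:.1f} KiB per column")
print(f"{total_kib:.1f} KiB total")
print(f"{avg_record_bytes:.1f} B per record")
```

Under these assumptions the results round to the reported 7.8 KiB per variable, 132.9 KiB in total and 136.1 B per record, which also shows that the report uses binary units (1 KiB = 1024 B).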
Variable types
| CAT | 10 |
|---|---|
| NUM | 7 |
Reproduction
| Analysis started | 2020-08-10 12:30:16.955290 |
|---|---|
| Analysis finished | 2020-08-10 12:30:29.832120 |
| Duration | 12.88 seconds |
| Version | pandas-profiling v2.8.0 |
| Command line | pandas_profiling --config_file config.yaml [YOUR_FILE.csv] |
| Download configuration | config.yaml |
Warnings
| Message | Type |
|---|---|
| gross margin percentage has constant value "4.7619047619999995" | Constant |
| Date has a high cardinality: 89 distinct values | High cardinality |
| Time has a high cardinality: 506 distinct values | High cardinality |
| Total is highly correlated with Tax 5% and 2 other fields | High correlation |
| Tax 5% is highly correlated with Total and 2 other fields | High correlation |
| cogs is highly correlated with Tax 5% and 2 other fields | High correlation |
| gross income is highly correlated with Tax 5% and 2 other fields | High correlation |
| City is highly correlated with Branch | High correlation |
| Branch is highly correlated with City | High correlation |
| Time is uniformly distributed | Uniform |
| Invoice ID has unique values | Unique |
Invoice ID
Categorical
| Distinct count | 1000 |
|---|---|
| Unique (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.8 KiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 166-19-2553 | 1 | 0.1% |
| 244-08-0162 | 1 | 0.1% |
| 468-99-7231 | 1 | 0.1% |
| 712-39-0363 | 1 | 0.1% |
| 575-30-8091 | 1 | 0.1% |
| 232-16-2483 | 1 | 0.1% |
| 573-58-9734 | 1 | 0.1% |
| 529-56-3974 | 1 | 0.1% |
| 401-18-8016 | 1 | 0.1% |
| 585-11-6748 | 1 | 0.1% |
| Other values (990) | 990 | 99.0% |
Length
| Max length | 11 |
|---|---|
| Median length | 11 |
| Mean length | 11 |
| Min length | 11 |
Branch
Categorical
| Distinct count | 3 |
|---|---|
| Unique (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.8 KiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| A | 340 | 34.0% |
| B | 332 | 33.2% |
| C | 328 | 32.8% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
City
Categorical
| Distinct count | 3 |
|---|---|
| Unique (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.8 KiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| Yangon | 340 | 34.0% |
| Mandalay | 332 | 33.2% |
| Naypyitaw | 328 | 32.8% |
Length
| Max length | 9 |
|---|---|
| Median length | 8 |
| Mean length | 7.648 |
| Min length | 6 |
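Branch and City show the same 340 / 332 / 328 split, which is consistent with the "City is highly correlated with Branch" warning. As an illustration, the sketch below computes a bias-corrected Cramér's V (the Bergsma 2013 correction) for a hypothetical cross-tabulation in which every branch maps to exactly one city. That one-to-one mapping is an assumption suggested by the sample rows at the end of the report, not a table the report provides, and the function name is ours rather than part of pandas-profiling.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table (list of rows)."""
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (table[i][j] - expected) ** 2 / expected

    # Bergsma's correction of phi^2 and of the table dimensions.
    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    return math.sqrt(phi2_corr / denom) if denom > 0 else 0.0

# Hypothetical Branch x City table: rows A, B, C; columns Yangon, Mandalay, Naypyitaw.
branch_city = [
    [340, 0, 0],
    [0, 332, 0],
    [0, 0, 328],
]
print(cramers_v_corrected(branch_city))
```

For a table like this, where one variable fully determines the other, the coefficient comes out at (numerically) 1; for a table whose rows are proportional to each other it is 0.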
Customer type
Categorical
| Distinct count | 2 |
|---|---|
| Unique (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.8 KiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| Member | 501 | 50.1% |
| Normal | 499 | 49.9% |
Length
| Max length | 6 |
|---|---|
| Median length | 6 |
| Mean length | 6 |
| Min length | 6 |
Gender
Categorical
| Distinct count | 2 |
|---|---|
| Unique (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.8 KiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| Female | 501 | 50.1% |
| Male | 499 | 49.9% |
Length
| Max length | 6 |
|---|---|
| Median length | 6 |
| Mean length | 5.002 |
| Min length | 4 |
Product line
Categorical
| Distinct count | 6 |
|---|---|
| Unique (%) | 0.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.8 KiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| Fashion accessories | 178 | 17.8% |
| Food and beverages | 174 | 17.4% |
| Electronic accessories | 170 | 17.0% |
| Sports and travel | 166 | 16.6% |
| Home and lifestyle | 160 | 16.0% |
| Health and beauty | 152 | 15.2% |
Length
| Max length | 22 |
|---|---|
| Median length | 18 |
| Mean length | 18.54 |
| Min length | 17 |
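The Length figures follow from the value counts: each label's character length is weighted by how often the label occurs. A minimal sketch, using the Product line counts reported above:

```python
from statistics import median

# Value counts copied from the Product line table.
counts = {
    "Fashion accessories": 178,
    "Food and beverages": 174,
    "Electronic accessories": 170,
    "Sports and travel": 166,
    "Home and lifestyle": 160,
    "Health and beauty": 152,
}

# One length entry per observation.
lengths = []
for label, count in counts.items():
    lengths.extend([len(label)] * count)

mean_length = sum(lengths) / len(lengths)
print(max(lengths), median(lengths), mean_length, min(lengths))
```

The same weighting explains the non-integer mean lengths reported for other categorical variables in this report.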
Unit price
Real number (ℝ≥0)
| Distinct count | 943 |
|---|---|
| Unique (%) | 94.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 55.67213 |
|---|---|
| Minimum | 10.08 |
| Maximum | 99.96 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 7.8 KiB |
Quantile statistics
| Minimum | 10.08 |
|---|---|
| 5-th percentile | 15.279 |
| Q1 | 32.875 |
| median | 55.23 |
| Q3 | 77.935 |
| 95-th percentile | 97.222 |
| Maximum | 99.96 |
| Range | 89.88 |
| Interquartile range (IQR) | 45.06 |
Descriptive statistics
| Standard deviation | 26.49462835 |
|---|---|
| Coefficient of variation (CV) | 0.4759047004 |
| Kurtosis | -1.218591428 |
| Mean | 55.67213 |
| Median Absolute Deviation (MAD) | 22.505 |
| Skewness | 0.007077447853 |
| Sum | 55672.13 |
| Variance | 701.9653313 |
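Several of the Unit price figures above are derived from others: the range and IQR are differences of quantiles, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. A quick check using the reported values (which are rounded in the report, so agreement is only approximate):

```python
# Values copied from the Unit price tables above.
minimum, maximum = 10.08, 99.96
q1, q3 = 32.875, 77.935
mean = 55.67213
std = 26.49462835

value_range = maximum - minimum   # "Range"
iqr = q3 - q1                     # "Interquartile range (IQR)"
cv = std / mean                   # "Coefficient of variation (CV)"
variance = std ** 2               # "Variance"

print(round(value_range, 2), round(iqr, 3), round(cv, 6), round(variance, 4))
```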
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 83.77 | 3 | 0.3% |
| 64.08 | 2 | 0.2% |
| 32.32 | 2 | 0.2% |
| 21.58 | 2 | 0.2% |
| 45.38 | 2 | 0.2% |
| 48.5 | 2 | 0.2% |
| 26.26 | 2 | 0.2% |
| 21.12 | 2 | 0.2% |
| 39.75 | 2 | 0.2% |
| 24.74 | 2 | 0.2% |
| Other values (933) | 979 | 97.9% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 10.08 | 1 | 0.1% |
| 10.13 | 1 | 0.1% |
| 10.16 | 1 | 0.1% |
| 10.17 | 1 | 0.1% |
| 10.18 | 1 | 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 99.96 | 2 | 0.2% |
| 99.92 | 1 | 0.1% |
| 99.89 | 1 | 0.1% |
| 99.83 | 1 | 0.1% |
| 99.82 | 2 | 0.2% |
Quantity
Real number (ℝ≥0)
| Distinct count | 10 |
|---|---|
| Unique (%) | 1.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.51 |
|---|---|
| Minimum | 1 |
| Maximum | 10 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 7.8 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 3 |
| median | 5 |
| Q3 | 8 |
| 95-th percentile | 10 |
| Maximum | 10 |
| Range | 9 |
| Interquartile range (IQR) | 5 |
Descriptive statistics
| Standard deviation | 2.923430595 |
|---|---|
| Coefficient of variation (CV) | 0.5305681661 |
| Kurtosis | -1.215547226 |
| Mean | 5.51 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 0.01294104802 |
| Sum | 5510 |
| Variance | 8.546446446 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 10 | 119 | 11.9% |
| 1 | 112 | 11.2% |
| 4 | 109 | 10.9% |
| 7 | 102 | 10.2% |
| 5 | 102 | 10.2% |
| 6 | 98 | 9.8% |
| 9 | 92 | 9.2% |
| 2 | 91 | 9.1% |
| 3 | 90 | 9.0% |
| 8 | 85 | 8.5% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 112 | 11.2% |
| 2 | 91 | 9.1% |
| 3 | 90 | 9.0% |
| 4 | 109 | 10.9% |
| 5 | 102 | 10.2% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 10 | 119 | 11.9% |
| 9 | 92 | 9.2% |
| 8 | 85 | 8.5% |
| 7 | 102 | 10.2% |
| 6 | 98 | 9.8% |
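Because Quantity takes only ten distinct values, its summary statistics can be rebuilt from the value counts alone. The sketch below does so using the sample variance (dividing by n - 1); that choice is an assumption, made because it appears to match the reported figure.

```python
import math

# Value counts copied from the Quantity table above.
counts = {10: 119, 1: 112, 4: 109, 7: 102, 5: 102,
          6: 98, 9: 92, 2: 91, 3: 90, 8: 85}

n = sum(counts.values())
total = sum(value * count for value, count in counts.items())
mean = total / n

# Sample variance (n - 1 in the denominator); this choice is an assumption.
squared_dev = sum(count * (value - mean) ** 2 for value, count in counts.items())
variance = squared_dev / (n - 1)
std = math.sqrt(variance)

print(n, total, mean, round(variance, 6), round(std, 6))
```

The rebuilt sum, mean, variance and standard deviation can then be compared with the Descriptive statistics table for Quantity.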
Tax 5%
Real number (ℝ≥0)
| Distinct count | 990 |
|---|---|
| Unique (%) | 99.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 15.379369 |
|---|---|
| Minimum | 0.5085 |
| Maximum | 49.65 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 7.8 KiB |
Quantile statistics
| Minimum | 0.5085 |
|---|---|
| 5-th percentile | 1.955725 |
| Q1 | 5.924875 |
| median | 12.088 |
| Q3 | 22.44525 |
| 95-th percentile | 39.1665 |
| Maximum | 49.65 |
| Range | 49.1415 |
| Interquartile range (IQR) | 16.520375 |
Descriptive statistics
| Standard deviation | 11.70882548 |
|---|---|
| Coefficient of variation (CV) | 0.7613332823 |
| Kurtosis | -0.0818847579 |
| Mean | 15.379369 |
| Median Absolute Deviation (MAD) | 7.50875 |
| Skewness | 0.892569805 |
| Sum | 15379.369 |
| Variance | 137.0965941 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 22.428 | 2 | 0.2% |
| 8.377 | 2 | 0.2% |
| 13.188 | 2 | 0.2% |
| 4.464 | 2 | 0.2% |
| 9.0045 | 2 | 0.2% |
| 12.57 | 2 | 0.2% |
| 10.326 | 2 | 0.2% |
| 4.154 | 2 | 0.2% |
| 39.48 | 2 | 0.2% |
| 10.3635 | 2 | 0.2% |
| Other values (980) | 980 | 98.0% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.5085 | 1 | 0.1% |
| 0.6045 | 1 | 0.1% |
| 0.627 | 1 | 0.1% |
| 0.639 | 1 | 0.1% |
| 0.699 | 1 | 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 49.65 | 1 | 0.1% |
| 49.49 | 1 | 0.1% |
| 49.26 | 1 | 0.1% |
| 48.75 | 1 | 0.1% |
| 48.69 | 1 | 0.1% |
Total
Real number (ℝ≥0)
| Distinct count | 990 |
|---|---|
| Unique (%) | 99.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 322.966749 |
|---|---|
| Minimum | 10.6785 |
| Maximum | 1042.65 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 7.8 KiB |
Quantile statistics
| Minimum | 10.6785 |
|---|---|
| 5-th percentile | 41.070225 |
| Q1 | 124.422375 |
| median | 253.848 |
| Q3 | 471.35025 |
| 95-th percentile | 822.4965 |
| Maximum | 1042.65 |
| Range | 1031.9715 |
| Interquartile range (IQR) | 346.927875 |
Descriptive statistics
| Standard deviation | 245.8853351 |
|---|---|
| Coefficient of variation (CV) | 0.7613332823 |
| Kurtosis | -0.0818847579 |
| Mean | 322.966749 |
| Median Absolute Deviation (MAD) | 157.68375 |
| Skewness | 0.892569805 |
| Sum | 322966.749 |
| Variance | 60459.59802 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 175.917 | 2 | 0.2% |
| 829.08 | 2 | 0.2% |
| 189.0945 | 2 | 0.2% |
| 470.988 | 2 | 0.2% |
| 93.744 | 2 | 0.2% |
| 216.846 | 2 | 0.2% |
| 276.948 | 2 | 0.2% |
| 87.234 | 2 | 0.2% |
| 217.6335 | 2 | 0.2% |
| 263.97 | 2 | 0.2% |
| Other values (980) | 980 | 98.0% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 10.6785 | 1 | 0.1% |
| 12.6945 | 1 | 0.1% |
| 13.167 | 1 | 0.1% |
| 13.419 | 1 | 0.1% |
| 14.679 | 1 | 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1042.65 | 1 | 0.1% |
| 1039.29 | 1 | 0.1% |
| 1034.46 | 1 | 0.1% |
| 1023.75 | 1 | 0.1% |
| 1022.49 | 1 | 0.1% |
Date
Categorical
| Distinct count | 89 |
|---|---|
| Unique (%) | 8.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.8 KiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 2/7/2019 | 20 | 2.0% |
| 2/15/2019 | 19 | 1.9% |
| 3/2/2019 | 18 | 1.8% |
| 3/14/2019 | 18 | 1.8% |
| 1/8/2019 | 18 | 1.8% |
| 1/25/2019 | 17 | 1.7% |
| 1/26/2019 | 17 | 1.7% |
| 1/23/2019 | 17 | 1.7% |
| 3/5/2019 | 17 | 1.7% |
| 3/9/2019 | 16 | 1.6% |
| Other values (79) | 823 | 82.3% |
Length
| Max length | 9 |
|---|---|
| Median length | 9 |
| Mean length | 8.677 |
| Min length | 8 |
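Date is profiled as a categorical variable because its values are strings such as 2/7/2019, which is why the report shows string lengths and a High cardinality warning rather than a date range. If date-based analysis is wanted, the strings can be parsed first. A minimal sketch, assuming the month/day/year order that values like 2/15/2019 and 3/14/2019 imply (the helper name is hypothetical):

```python
from datetime import date, datetime

def parse_report_date(text: str) -> date:
    """Parse an M/D/YYYY string (assumed format) into a date."""
    return datetime.strptime(text, "%m/%d/%Y").date()

# A few of the most common Date values from the table above.
samples = ["2/7/2019", "2/15/2019", "3/2/2019", "3/14/2019", "1/8/2019"]
parsed = sorted(parse_report_date(s) for s in samples)
print(parsed[0], parsed[-1])
```

With pandas, `pd.to_datetime(df["Date"], format="%m/%d/%Y")` performs the same conversion on a whole column, where `df` stands for the loaded data set.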
Time
Categorical
| Distinct count | 506 |
|---|---|
| Unique (%) | 50.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.8 KiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 19:48 | 7 | 0.7% |
| 14:42 | 7 | 0.7% |
| 17:38 | 6 | 0.6% |
| 11:51 | 5 | 0.5% |
| 17:16 | 5 | 0.5% |
| 19:44 | 5 | 0.5% |
| 13:48 | 5 | 0.5% |
| 17:36 | 5 | 0.5% |
| 13:58 | 5 | 0.5% |
| 19:30 | 5 | 0.5% |
| Other values (496) | 945 | 94.5% |
Length
| Max length | 5 |
|---|---|
| Median length | 5 |
| Mean length | 5 |
| Min length | 5 |
Payment
Categorical
| Distinct count | 3 |
|---|---|
| Unique (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.8 KiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| Ewallet | 345 | 34.5% |
| Cash | 344 | 34.4% |
| Credit card | 311 | 31.1% |
Length
| Max length | 11 |
|---|---|
| Median length | 7 |
| Mean length | 7.212 |
| Min length | 4 |
cogs
Real number (ℝ≥0)
| Distinct count | 990 |
|---|---|
| Unique (%) | 99.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 307.58738 |
|---|---|
| Minimum | 10.17 |
| Maximum | 993.0 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 7.8 KiB |
Quantile statistics
| Minimum | 10.17 |
|---|---|
| 5-th percentile | 39.1145 |
| Q1 | 118.4975 |
| median | 241.76 |
| Q3 | 448.905 |
| 95-th percentile | 783.33 |
| Maximum | 993 |
| Range | 982.83 |
| Interquartile range (IQR) | 330.4075 |
Descriptive statistics
| Standard deviation | 234.1765096 |
|---|---|
| Coefficient of variation (CV) | 0.7613332823 |
| Kurtosis | -0.0818847579 |
| Mean | 307.58738 |
| Median Absolute Deviation (MAD) | 150.175 |
| Skewness | 0.892569805 |
| Sum | 307587.38 |
| Variance | 54838.63766 |
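Tax 5%, Total and cogs report identical coefficients of variation, skewness and kurtosis. That is what one would expect if each column is a positive constant multiple of the others, since these three statistics are unchanged by rescaling. The check below uses only figures reported in this document; the ratios it prints suggest Total ≈ 21 × Tax 5% and cogs ≈ 20 × Tax 5%, although the report itself does not state that relationship.

```python
# Means and standard deviations copied from the descriptive statistics tables.
reported = {
    "Tax 5%": {"mean": 15.379369, "std": 11.70882548},
    "Total": {"mean": 322.966749, "std": 245.8853351},
    "cogs": {"mean": 307.58738, "std": 234.1765096},
}

base = reported["Tax 5%"]
for name, stats in reported.items():
    mean_ratio = stats["mean"] / base["mean"]   # scale factor implied by the means
    std_ratio = stats["std"] / base["std"]      # scale factor implied by the spreads
    cv = stats["std"] / stats["mean"]           # unchanged by rescaling
    print(f"{name}: mean ratio {mean_ratio:.4f}, std ratio {std_ratio:.4f}, CV {cv:.6f}")
```

This shared scaling is a plausible reason for the High correlation warnings on these fields.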
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 263.76 | 2 | 0.2% |
| 448.56 | 2 | 0.2% |
| 207.27 | 2 | 0.2% |
| 180.09 | 2 | 0.2% |
| 206.52 | 2 | 0.2% |
| 83.08 | 2 | 0.2% |
| 167.54 | 2 | 0.2% |
| 251.4 | 2 | 0.2% |
| 89.28 | 2 | 0.2% |
| 789.6 | 2 | 0.2% |
| Other values (980) | 980 | 98.0% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 10.17 | 1 | 0.1% |
| 12.09 | 1 | 0.1% |
| 12.54 | 1 | 0.1% |
| 12.78 | 1 | 0.1% |
| 13.98 | 1 | 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 993 | 1 | 0.1% |
| 989.8 | 1 | 0.1% |
| 985.2 | 1 | 0.1% |
| 975 | 1 | 0.1% |
| 973.8 | 1 | 0.1% |
gross margin percentage
Categorical
| Distinct count | 1 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.8 KiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 4.761904762 | 1000 | 100.0% |
Length
| Max length | 18 |
|---|---|
| Median length | 18 |
| Mean length | 18 |
| Min length | 18 |
gross income
Real number (ℝ≥0)
| Distinct count | 990 |
|---|---|
| Unique (%) | 99.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 15.379369 |
|---|---|
| Minimum | 0.5085 |
| Maximum | 49.65 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 7.8 KiB |
Quantile statistics
| Minimum | 0.5085 |
|---|---|
| 5-th percentile | 1.955725 |
| Q1 | 5.924875 |
| median | 12.088 |
| Q3 | 22.44525 |
| 95-th percentile | 39.1665 |
| Maximum | 49.65 |
| Range | 49.1415 |
| Interquartile range (IQR) | 16.520375 |
Descriptive statistics
| Standard deviation | 11.70882548 |
|---|---|
| Coefficient of variation (CV) | 0.7613332823 |
| Kurtosis | -0.0818847579 |
| Mean | 15.379369 |
| Median Absolute Deviation (MAD) | 7.50875 |
| Skewness | 0.892569805 |
| Sum | 15379.369 |
| Variance | 137.0965941 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 22.428 | 2 | 0.2% |
| 8.377 | 2 | 0.2% |
| 13.188 | 2 | 0.2% |
| 4.464 | 2 | 0.2% |
| 9.0045 | 2 | 0.2% |
| 12.57 | 2 | 0.2% |
| 10.326 | 2 | 0.2% |
| 4.154 | 2 | 0.2% |
| 39.48 | 2 | 0.2% |
| 10.3635 | 2 | 0.2% |
| Other values (980) | 980 | 98.0% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.5085 | 1 | 0.1% |
| 0.6045 | 1 | 0.1% |
| 0.627 | 1 | 0.1% |
| 0.639 | 1 | 0.1% |
| 0.699 | 1 | 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 49.65 | 1 | 0.1% |
| 49.49 | 1 | 0.1% |
| 49.26 | 1 | 0.1% |
| 48.75 | 1 | 0.1% |
| 48.69 | 1 | 0.1% |
Rating
Real number (ℝ≥0)
| Distinct count | 61 |
|---|---|
| Unique (%) | 6.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 6.9727 |
|---|---|
| Minimum | 4.0 |
| Maximum | 10.0 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 7.8 KiB |
Quantile statistics
| Minimum | 4 |
|---|---|
| 5-th percentile | 4.295 |
| Q1 | 5.5 |
| median | 7 |
| Q3 | 8.5 |
| 95-th percentile | 9.7 |
| Maximum | 10 |
| Range | 6 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 1.718580294 |
|---|---|
| Coefficient of variation (CV) | 0.2464727142 |
| Kurtosis | -1.151586839 |
| Mean | 6.9727 |
| Median Absolute Deviation (MAD) | 1.5 |
| Skewness | 0.009009648766 |
| Sum | 6972.7 |
| Variance | 2.953518228 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 6 | 26 | 2.6% |
| 6.6 | 24 | 2.4% |
| 9.5 | 22 | 2.2% |
| 4.2 | 22 | 2.2% |
| 8 | 21 | 2.1% |
| 6.2 | 21 | 2.1% |
| 6.5 | 21 | 2.1% |
| 5 | 21 | 2.1% |
| 5.1 | 21 | 2.1% |
| 7 | 20 | 2.0% |
| Other values (51) | 781 | 78.1% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 4 | 11 | 1.1% |
| 4.1 | 17 | 1.7% |
| 4.2 | 22 | 2.2% |
| 4.3 | 18 | 1.8% |
| 4.4 | 17 | 1.7% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 10 | 5 | 0.5% |
| 9.9 | 16 | 1.6% |
| 9.8 | 19 | 1.9% |
| 9.7 | 14 | 1.4% |
| 9.6 | 17 | 1.7% |
Pearson's r
Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, which implies that for a linear relationship the steepness of the slope (the angle to the x-axis) does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
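The calculation described above can be sketched in a few lines of plain Python. The function name is ours, and the inputs are the Tax 5%, Total, Unit price and Rating values from the first five sample rows of this report, used purely as illustration; five rows say nothing reliable about the full data set.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # The 1/(n - 1) factors of covariance and standard deviations cancel out.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

tax = [26.1415, 3.82, 16.2155, 23.288, 30.2085]
total = [548.9715, 80.22, 340.5255, 489.048, 634.3785]
unit_price = [74.69, 15.28, 46.33, 58.22, 86.31]
rating = [9.1, 9.6, 7.4, 8.4, 5.3]

print(pearson_r(tax, total))          # very close to 1: Total is a multiple of Tax 5%
print(pearson_r(unit_price, rating))  # some value strictly between -1 and 1
```

Shifting or rescaling either input by a positive factor, for example `[3 * v + 7 for v in rating]`, leaves the result unchanged, which is the invariance property mentioned above.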
Spearman's ρ
Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
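A minimal sketch of that rank-based calculation follows. The function names are ours, and ties are given the average of the ranks they span, which is a common convention but an assumption here.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson_r(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def spearman_rho(x, y):
    """Pearson's r applied to the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # monotonic but not linear
print(spearman_rho(x, y), pearson_r(x, y))
```

For this monotonic but nonlinear example ρ reaches 1 while Pearson's r stays below 1, which illustrates the difference between the two measures.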
Kendall's τ
Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations; τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
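The pair-counting definition above corresponds to the variant usually called tau-a. The sketch below implements exactly that definition; the function name is ours, and tied pairs count as neither concordant nor discordant. Library implementations may default to a tie-adjusted variant (for example, `scipy.stats.kendalltau` computes tau-b by default), so results can differ when ties are present.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1      # both variables move in the same direction
        elif direction < 0:
            discordant += 1      # the variables move in opposite directions
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

# Six pairs in total: five concordant, one discordant.
print(kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4]))
```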
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
First rows
| | Invoice ID | Branch | City | Customer type | Gender | Product line | Unit price | Quantity | Tax 5% | Total | Date | Time | Payment | cogs | gross margin percentage | gross income | Rating |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 750-67-8428 | A | Yangon | Member | Female | Health and beauty | 74.69 | 7 | 26.1415 | 548.9715 | 1/5/2019 | 13:08 | Ewallet | 522.83 | 4.761905 | 26.1415 | 9.1 |
| 1 | 226-31-3081 | C | Naypyitaw | Normal | Female | Electronic accessories | 15.28 | 5 | 3.8200 | 80.2200 | 3/8/2019 | 10:29 | Cash | 76.40 | 4.761905 | 3.8200 | 9.6 |
| 2 | 631-41-3108 | A | Yangon | Normal | Male | Home and lifestyle | 46.33 | 7 | 16.2155 | 340.5255 | 3/3/2019 | 13:23 | Credit card | 324.31 | 4.761905 | 16.2155 | 7.4 |
| 3 | 123-19-1176 | A | Yangon | Member | Male | Health and beauty | 58.22 | 8 | 23.2880 | 489.0480 | 1/27/2019 | 20:33 | Ewallet | 465.76 | 4.761905 | 23.2880 | 8.4 |
| 4 | 373-73-7910 | A | Yangon | Normal | Male | Sports and travel | 86.31 | 7 | 30.2085 | 634.3785 | 2/8/2019 | 10:37 | Ewallet | 604.17 | 4.761905 | 30.2085 | 5.3 |
| 5 | 699-14-3026 | C | Naypyitaw | Normal | Male | Electronic accessories | 85.39 | 7 | 29.8865 | 627.6165 | 3/25/2019 | 18:30 | Ewallet | 597.73 | 4.761905 | 29.8865 | 4.1 |
| 6 | 355-53-5943 | A | Yangon | Member | Female | Electronic accessories | 68.84 | 6 | 20.6520 | 433.6920 | 2/25/2019 | 14:36 | Ewallet | 413.04 | 4.761905 | 20.6520 | 5.8 |
| 7 | 315-22-5665 | C | Naypyitaw | Normal | Female | Home and lifestyle | 73.56 | 10 | 36.7800 | 772.3800 | 2/24/2019 | 11:38 | Ewallet | 735.60 | 4.761905 | 36.7800 | 8.0 |
| 8 | 665-32-9167 | A | Yangon | Member | Female | Health and beauty | 36.26 | 2 | 3.6260 | 76.1460 | 1/10/2019 | 17:15 | Credit card | 72.52 | 4.761905 | 3.6260 | 7.2 |
| 9 | 692-92-5582 | B | Mandalay | Member | Female | Food and beverages | 54.84 | 3 | 8.2260 | 172.7460 | 2/20/2019 | 13:27 | Credit card | 164.52 | 4.761905 | 8.2260 | 5.9 |
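The sample rows above suggest how the monetary columns relate to one another, which would explain both the High correlation warnings and the constant gross margin percentage. The sketch below checks a hypothesised set of formulas (cogs = Unit price × Quantity, Tax 5% = 0.05 × cogs, Total = cogs + Tax 5%, gross income = Tax 5%) against the first three rows. These formulas are inferred from the data and are not documented in the report.

```python
import math

# (unit price, quantity, tax, total, cogs, gross income) from the first three rows.
rows = [
    (74.69, 7, 26.1415, 548.9715, 522.83, 26.1415),
    (15.28, 5, 3.8200, 80.2200, 76.40, 3.8200),
    (46.33, 7, 16.2155, 340.5255, 324.31, 16.2155),
]

def derive(unit_price, quantity):
    """Hypothesised formulas for the derived columns (an assumption)."""
    cogs = unit_price * quantity
    tax = 0.05 * cogs
    total = cogs + tax
    gross_income = tax
    margin_pct = 100 * gross_income / total
    return tax, total, cogs, gross_income, margin_pct

for unit_price, quantity, *expected in rows:
    *derived, margin_pct = derive(unit_price, quantity)
    matches = all(math.isclose(d, e, abs_tol=1e-6) for d, e in zip(derived, expected))
    print(matches, round(margin_pct, 6))
```

Under these formulas the margin is 100 × 0.05 / 1.05 = 100 / 21 for every row, regardless of price or quantity, which agrees with the single value reported for gross margin percentage.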
Last rows
| | Invoice ID | Branch | City | Customer type | Gender | Product line | Unit price | Quantity | Tax 5% | Total | Date | Time | Payment | cogs | gross margin percentage | gross income | Rating |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 990 | 886-18-2897 | A | Yangon | Normal | Female | Food and beverages | 56.56 | 5 | 14.1400 | 296.9400 | 3/22/2019 | 19:06 | Credit card | 282.80 | 4.761905 | 14.1400 | 4.5 |
| 991 | 602-16-6955 | B | Mandalay | Normal | Female | Sports and travel | 76.60 | 10 | 38.3000 | 804.3000 | 1/24/2019 | 18:10 | Ewallet | 766.00 | 4.761905 | 38.3000 | 6.0 |
| 992 | 745-74-0715 | A | Yangon | Normal | Male | Electronic accessories | 58.03 | 2 | 5.8030 | 121.8630 | 3/10/2019 | 20:46 | Ewallet | 116.06 | 4.761905 | 5.8030 | 8.8 |
| 993 | 690-01-6631 | B | Mandalay | Normal | Male | Fashion accessories | 17.49 | 10 | 8.7450 | 183.6450 | 2/22/2019 | 18:35 | Ewallet | 174.90 | 4.761905 | 8.7450 | 6.6 |
| 994 | 652-49-6720 | C | Naypyitaw | Member | Female | Electronic accessories | 60.95 | 1 | 3.0475 | 63.9975 | 2/18/2019 | 11:40 | Ewallet | 60.95 | 4.761905 | 3.0475 | 5.9 |
| 995 | 233-67-5758 | C | Naypyitaw | Normal | Male | Health and beauty | 40.35 | 1 | 2.0175 | 42.3675 | 1/29/2019 | 13:46 | Ewallet | 40.35 | 4.761905 | 2.0175 | 6.2 |
| 996 | 303-96-2227 | B | Mandalay | Normal | Female | Home and lifestyle | 97.38 | 10 | 48.6900 | 1022.4900 | 3/2/2019 | 17:16 | Ewallet | 973.80 | 4.761905 | 48.6900 | 4.4 |
| 997 | 727-02-1313 | A | Yangon | Member | Male | Food and beverages | 31.84 | 1 | 1.5920 | 33.4320 | 2/9/2019 | 13:22 | Cash | 31.84 | 4.761905 | 1.5920 | 7.7 |
| 998 | 347-56-2442 | A | Yangon | Normal | Male | Home and lifestyle | 65.82 | 1 | 3.2910 | 69.1110 | 2/22/2019 | 15:33 | Cash | 65.82 | 4.761905 | 3.2910 | 4.1 |
| 999 | 849-09-3807 | A | Yangon | Member | Female | Fashion accessories | 88.34 | 7 | 30.9190 | 649.2990 | 2/18/2019 | 13:28 | Cash | 618.38 | 4.761905 | 30.9190 | 6.6 |